Tagging for Learning: Collecting Thematic Relations from Corpus
Abstract
Recent work in text analysis has suggested that data on words that frequently occur together reveal important information about text content. Co-occurrence relations can serve two main purposes in language processing. First, the statistics of co-occurrence have been shown to produce accurate results in syntactic analysis. Second, the way that words appear together can help in assigning thematic roles in semantic interpretation. This paper discusses a method for collecting co-occurrence data, acquiring lexical relations from the data, and applying these relations to semantic analysis.
Similar resources
Learning Thematic Role Relations for Wordnets
In this paper, I present a method for learning thematic role relations (selectional preferences) for wordnets by means of statistical corpus analysis. An evaluation on a gold standard, which I extracted from EuroWordNet, shows that this method achieves a learning accuracy of up to 77%. I also propose a preprocessing step for a partial lexical disambiguation of the input data. This disambiguatio...
The language of collaborative tagging
Collaborative tagging is the process whereby people attach keywords, known as tags, to digital resources, such as text and images, in order to render them retrievable in the future. This thesis investigates how tags submitted by users in collaborative tagging systems function as descriptors of a resource’s perceived content. Using computational and theoretical tools, I compare collaborative tag...
Massively parallel learning of part-of-speech disambiguation
This paper presents a method for massively parallel learning of part-of-speech disambiguation based on a minmax modular neural network model. The method has three main steps. Firstly, a large-scale tagging problem is decomposed into a number of relatively smaller and simpler subproblems according to the class relations among a given training corpus. Secondly, all of the subproblems are learned ...
Learning "Generalization/Specialization" Relations between Concepts - Application for Automatically Building Thematic Document Hierarchies
We introduce a new method for automatically constructing concept hierarchies where the concept nodes follow a generalization / specialization relation. Starting from a set of concepts automatically extracted from a corpus, we show how to learn generalization / specialization relations between couples of concepts and how this leads to the construction of the hierarchy. We present an application ...
Part-of-Speech Tagging for Code-Mixed English-Hindi Twitter and Facebook Chat Messages
The paper reports work on collecting and annotating code-mixed English-Hindi social media text (Twitter and Facebook messages), and experiments on automatic tagging of these corpora, using both a coarse-grained and a fine-grained part-of-speech tag set. We compare the performance of a combination of language specific taggers to that of applying four machine learning algorithms to the task (Condi...
Publication date: 1990